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We propose a censored quantile regression estimator motivated by unbiased estimating equa- 
tions. Under the usual conditional independence assumption of the survival time and the cen- 
soring time given the covariates, we show that the proposed estimator is consistent and asymp- 
totically normal. We develop an efficient computational algorithm which uses existing quantile 
regression code. As a result, bootstrap-type inference can be efficiently implemented. We illus- 
trate the finite-sample performance of the proposed method by simulation studies and analysis 
of a survival data set. 
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1. Introduction 

Censored data arise frequently in biomedical, psychological, social studies and many 
other applied fields (Kalbficisch and Prentice [7]). Analysis of such data is complicated 
by censoring, where an object's time-to-death or other end-point of interest is known to 
occur only in a certain period of time. 

Denote T as the survival time and C < 7o as the censoring time, where 7o is the largest 
follow-up study time. The typical censored data set consists of independent observations 
(Yi, Si, Zi). i = 1, . . . ,n, where Yi = min(Tj, Cj) is the observed failure time; Si = Z(T, < 
Ci) is the censoring indicator; and Zi is a p-dimcnsional covariate vector including an 
intercept. The accelerated failure time model (AFT), specified as Tj = f3 T Zi + Ei with 
Si,i = l,...,n, following a common distribution independently, was studied in a number 
of papers (Jin et al. [5]; Zeng and Lin [26]). However, the assumption made for the 
AFT model precludes error heteroscedasiticity and will yield biased results when it is 
inappropriate. 
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Quantile regression introduced by Koenker and Bassctt [10] has become an increas- 
ingly important tool in statistical analysis. Contrary to the usual model for the condi- 
tional mean, it provides distributional information on the dependence of T on Z . A com- 
prehensive review can be found in Koenker [8]. The usefulness of quantile regression 
in survival analysis was discussed by Koenker and Geiling [11]. The rth conditional 
quantile function of the dependent variable T given covariates Z, Qt(t\Z), is defined 
as Qt(t\Z) = inf{u: Fq(v\Z) > r}, where Fq is the cumulative conditional distribution 
function of T given Z. Correspondingly, a quantile regression model for Qt(t\Z) with 
r € (0, 1) can be denoted as 

Qt{t\Z)=P' t Z. (1) 

Note that the AFT model is a special case of this model when /3i, the coefficient cor- 
responding to the intercept, is quantile dependent, but the other coefficients in f3 are 
quantile independent. Another example is the location-scale model Tj = erf Zi + [a^Zi)Si 
in which /? = a.\ + o.2F~ 1 (t) when e$ is independent of Zi, and F e is the distribution 
function of £j. 

When data are subject to censoring, statistical estimation and inference for quantile 
regression is more involved. Indeed, a naive procedure which completely ignores censoring 
may give highly biased estimates (Koenker [8]). The situation is more complicated if 
censoring time depends on the covariates. In Section 5, we present the Colorado Plateau 
uranium miners cohort data (Lubin et al. [14], Langholz and Goldstein [12]), where the 
major interest of this study is to assess the effect of smoking and radon exposure on the 
rate of median death time of lung cancer. It is found that the censoring time is highly 
correlated with the covariates. Ignoring this dependence may yield biased estimates; see 
the Numerical Study section for examples. 

Powell [19, 20] first studied censored quantile regression with fixed censoring. For 
random censoring, Ying, Jung and Wei [25] (YJW) proposed a semiparametric median 
regression model. Despite the simplicity of the method in YJW, this procedure requires 
the unconditional independence of the survival time and censoring time. This assump- 
tion is often restrictive as conditional independence, given the covariates, is more natural 
(Kalbflcisch and Prentice [7]). In addition, the estimating equation approach proposed in 
YJW involves solving non-monotone discrete equations, creating difficulty for optimiza- 
tion. As a consequence, inferential procedures such as the resampling approach in Jin, 
Ying and Wei [6] , or the bootstrap method, can be prohibitive computationally. See also 
Leon, Cai and Wei [13] for a generalization of this method to partly linear models. 

Relaxing the independence condition to conditional independence, Portnoy [17] and 
Neocleous, Vanden Brandan and Portnoy [15] developed a novel estimating approach 
motivated by the classical Kaplan-Meier estimator in the one sample analysis. Using 
the martingale representation, Peng and Huang [16] studied another censored quantile 
regression estimator motivated by the Nelson- Aalen estimator. However, a major short- 
coming of Portnoy and Peng and Huang's methods is that a global linear assumption 
has to be made, even for estimating the quantile coefficient at a single quantile. To relax 
this assumption, Wang and Wang [22] recently proposed an innovative redistribution of 
mass idea, which employs local weighting. 
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Motivated by the unbiased estimating equation for the quantile regression (Ying et 
al. [25]), we propose a new quantile regression estimator. Under the usual conditional 
independence assumption of T and C given Z, we show that the proposed estimator is 
consistent and asymptotically normal. We develop an efficient algorithm which utilizes 
existing quantile regression code for estimation. The efficient code enables us to use 
the bootstrap procedure for statistical inference. Our method provides an alternative to 
Wang and Wang's locally weighted censored quantile regression. However, the framework 
proposed by Ying et al., and used by us, may be conceptually simpler. 

The rest of the paper is organized as follows. Section 2 discusses the new estimator 
and the fast computing algorithm. Section 3 provides the theoretical results of the new 
estimator. Some numerical studies are presented in Section 4. A data analysis is provided 
in Section 5. Some concluding remarks are given in Section 6. All the proofs are relegated 
to the Appendix. When no confusion arises, the dependence of /3 on t is suppressed. 

2. Censored quantile regression 

To estimate /3 T in (1) for the rth quantile, we propose to solve the following estimating 
equation: 



where /(■) is the indicator function, and G is the Kaplan-Meier estimate for Go(-\Zi), 
the conditional survival function of the censoring variable C given the covariates. The 
estimating equation in (2) is motivated by the fact that E[I(Yi — /3'Zi > 0)\Zi] = (1 — 
t)Gq(P' Zi\Zi) using the conditional independence of Ti and Cj given Z^. YJW assumes 
that Go(j3' Zi\Zi) = Go(j3'Zi), and our formulation is an extension of YJW's median 
regression to quantile regression by allowing Go to depend on Z . To solve (2), we need to 
find its root. Ying et al. [25] proposed to minimize ||M„(/3)|| , a discrete and non-monotone 
function. Computational complication naturally arises. Ying et al. proposed to use the 
simulated annealing algorithm, or simply the bisection algorithm, to solve the estimating 
equation, which is computationally demanding. Another complication arises in statistical 
inference. Since the sampling distribution of the solution involves the unknown density 
functions of the data, resampling-based approaches are effective tools for conducting 
statistical inference (Jin, Lin, Wei and Ying [5]). However, inference procedures via these 
methods would be computationally even more intensive than point estimation, due to 
the lack of an efficient algorithm. 

We start by proposing a new algorithm to solve (2). First note that we can write the 
estimating equation in (2) as 






n 



n 
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where we write G(/3' Zi\Zi) for a preliminary estimate of (3 as Gi for brevity. In practice, 
we set I(Yi — (3 T Zi > 0)/Gi = if Gi = as recommended by Ying et al. [25]. The solution 
to this function is the minimizcr of the following linear programming problem: 

n 

S n (J3) = G^ipAYi - fZ % ) + p T (Y* - p T Z t {G t - 1))}, (3) 

i—l 

where p T (s) = s[I(s > 0) — (1 — r)] is the check loss function used in quantile regression, 
and Y* is a small constant less than — \(3 T Zi(Gi — 1)| for all (3's in a compact space. In this 
paper, we set Y* = Y* = min{y;} — A with A = 200. This new formulation suggests that 
the fast quantile regression code (Portnoy and Koenker [18], Kocnker [8]) can be directly 
used to solve the censored quantile regression defined by (2). We note that Yin and 
Cai [23] used the Nelder-Mead simplex algorithm when T and C are independent. The 
simplex algorithm is generally much slower than the interior point algorithm specially 
developed for quantile regression (Portnoy and Koenker [18]). 

For estimating the weighting function Gq(/3' Zi\Zi), we propose to use the local Kaplan- 
Meier estimator G(j3' Zi\Zi). To be specific, Go(-\Zi) is estimated by 



G(t\ Z )=n 

i=i 



B nj (z) 



-, I(Yj<t,Sj=Q) 



(4) 



where B n j(z) is a sequence of non- negative weights adding up to 1. When B n j(z) = l/n 
for all j, Go(t\z) reduces to the classical Kaplan-Meier estimator of the survival function 
in the one-sample case. Following the idea of Wang and Wang [22], we use 



B nj (z)=K 



,k=l 



h„ 



(5) 



where K(-) is a density function, and h n > is the bandwidth. This is the familiar ker- 
nel estimator for the survival function discussed, for example, in Gonzalez-Manteiga and 
Cadarso-Suarcz [4]. When Z is multi-dimensional, we can use the product kernel. For 
example, in the bivariate case, we can use K(z\,Z2) = K\(z\)K%{Z'i) where K\{-) and 
^2( - ) are both univariate kernel functions. In this article, we use the bi-quadratic kernel, 
defined as K(s) = — s 2 ) 2 I(\s\ < 1), which is also used by Wang and Wang [22]. Alter- 
natively, we may use a multivariate density function, for example, from the multivariate 
normal distribution (Fan and Gijbels [3]). 

Since G(fi' Z^\Zi) depends on the unknown parameter /3, we propose the following 
iterative algorithm between solving for /3 and G(j3' Zi\Zi), while the other one is fixed: 

1. Given an initial estimate of j3 denoted as set k = 0. 

2. Estimate Go(^/3 (fe) |Zi) as G l using the local Kaplan-Meier estimator. Minimize 
S n {l3) in (3) to obtain /3 ( - k+1 K 

3. Set k k + 1. Go to Step 2 until a convergence criterion is met. 
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For the initial estimate, we use a similar method as in (Yin and Cai [23]) to solve the 
following monotone estimating function: 

n r 

where G(Yi\Zi) is the local Kaplan-Meier estimator. This estimator can be seen as the 
inverse probability weighted quantile regression function (Bang and Tsiatis [1]). Similarly 
to (Yin and Cai [23]), consistency of p(°> can be shown, and convergence of the solution 
series {(3^} follows by the method of induction. Note that, although the initial estimate 
is also reasonable, its efficiency is adversely affected by the fact that only non-censored 
observations are used. 

Remark 1. We note that our estimator requires estimating Gq(/3' Zi\Z{). In compari- 
son, Wang and Wang [22] estimated the conditional cumulative distribution function of 
the survival time Tj, evaluated at Ci, given the covariate Z±. Both estimators use local 
Kaplan-Meier estimation. Since both methods are estimation equation based approaches, 
which one is more efficient is likely dependent on the particular problem under analysis. 
We observe empirically that the proposed method performs satisfactorily even if the cen- 
soring rate is reasonably low. Of course, if the censoring rate is very low, the proposed 
method ultimately suffers due to the low sample size used for the local Kaplan-Mcicr 
estimate. 

We briefly discuss the computation issue before presenting the asymptotic results. The 
estimation problem in (3) is essentially weighted quantile regression after the weights 
Go(Z' i f3\Zi) are estimated using the local Kaplan-Meier method. This method can be 
easily implemented by extending the Kaplan-Meier estimate for the survival function 
of C. In particular, we augment observations (Y*,Zi(Gi — 1)) with weights 1/Gi to the 
existing data set (Yi,Zi) with weight 1/Gi. Wc then apply function rq in R library 
quantreg on the augmented data set using the weights to fit a regular quantile regression 
model. This process has to be iterated since /3 in Go(Z' i /3\Zi) is unknown. The iteration is 
initialized by using the inverse probability estimator. For the examples in the simulation 
part and the data analysis, this iteration is very fast. Convergence is achieved usually 
in a few iterations for a reasonable convergence criterion. Note that we need to estimate 
Go(Z' i P\Zi) at each iteration, while no iteration is needed for Wang and Wang's algorithm. 

3. Asymptotic theory 

We establish the consistency and the asymptotic normality of the estimator in this sec- 
tion. To derive the asymptotic properties of the proposed estimator, we require the 
following regularity assumptions. For convenience, we write the true value of /3 as /3q. 
The regularity conditions are listed as follows: 

CI. T and C arc conditionally independent given the covariate Z . 
C2. The true value /3q of (3 is in the interior of a bounded convex region B. The support 
Z of Z is bounded. 
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C3. infzez P(Y >T\Z)> m > 0, where T = To V sup ZezpeB Z'f3. 

C4. The conditional density functions fo(t\z) and go{t\z) of the failure time T and C, 
respectively, are uniformly bounded away from infinity and have bounded (uni- 
formly in t) second order partial derivatives with respect to z. 

C5. The bandwidth h n satisfies h n = 0(n~ v ) with < v < 1/2. 

C6. The kernel function K(-) > is compactly supported and satisfies the Lip- 
schitz condition of order 1, J K(u) = 1,J uK(u)du = 0,J K 2 (u)du < oo and 
J\\u\\ 2 K(u)du<oo. 

C7. For j3 in a neighborhood of /3q, E[ZZ' fo(Z' (3\Z)] is positive definite. 

Assumptions C1-C4 are standard in survival analysis. Assumption C5 is needed to en- 
sure the consistency of the local Kaplan-Meier estimator. Assumption C6 is routinely 
made in nonparametric smoothing, and assumption C7 ensures a unique solution for 
the limiting estimating equation in the neighborhood of /3q and is used to derive the 
asymptotic properties of the estimator. Intuitively, if G{fi' Z^\Z%) is a reasonable estima- 
tor of G(j3' Zi\Zi), the consistency of follows from the unbiascdness of the estimating 
equation (2). Formally, we have the following results for the consistency of the estimator. 

Theorem 1 (Consistency). Under conditions C1-C7, we have that f3 n —> ffo in prob- 
ability as n — > oo. 

The proof of this theorem uses the uniform consistency of G as an estimator of G and 
is similar to that in Ying et al. [25] . Since the criterion function is not smooth, we make 
use of the general theorem developed by Chen, Linton and Van Keilegom [2] to show the 
asymptotic normality of the resulting estimator. 

Theorem 2 (Asymptotic normality). Under conditions C1-C7, if 1/4 < v < 1/3, 
then we have that 

n 1 / 2 (/3„- ) 8o)4iV(0,r^Vr^ 1 ), 

where T\ = EZZ' fo(Z' (3q\Z) and V = cov(V^) with Vi defined in Lemma A. 3 in the 
Appendix. 

Note that this theorem is only for problems with a single covariate. As in Wang and 
Wang [22], we observe that the results are not very sensitive to h n . In practice, we can 
use K-folds cross validation to choose the bandwidth. This approach works by dividing 
the data set in K parts, which are about equally sized. For the fcth part, we use the 
rest K — 1 parts of the data to fit the model, and then evaluate the quantile loss from 
predicting the rth conditional quantile of T on the uncensored data that are left out. 
Averaging over k — 1, . . . , K , we choose the h that gives the minimum average quantile 
loss. 

The matrices Ti and V in the limiting covariancc matrix depend on the unknown con- 
ditional density function fo(-\z) and g (-\z). For censored data, they may not be estimated 
well nonparamctrically with finite sample. Thus we use the bootstrap resampling proce- 
dure for inference. The validity of this procedure can be shown following Jin et al. [5]. We 



Censored quantile regression 



7 



note that for the bootstrap or other resampling methods to be feasible computationally, 
efficient algorithms are instrumental because a large number of bootstrap replications 
are needed. 

4. Numerical study 

For simulation study, we compare the estimator of Ying et al. (YJW), the proposed 
estimator (CQR), the locally weighted censored quantile regression estimator (Lcrq) in 
Wang and Wang [22], the estimator in Portnoy [17], abbreviated as Port, and the es- 
timator by Peng and Huang [16], abbreviated as PH. We use Wang and Wang's code, 
available on their websites, for Lcrq and function crq in R library quantreg for Portnoy's 
and Peng and Huang's method. YJW is implemented via the iterative method in this 
paper by replacing G(f3' Zi\Zi) in (2) by G((3'Zi), which is the Kaplan-Meier estimate 
for the survival function of C. We follow Yin and Cai [23] and Zhou [27] to obtain the 
initial value of f3 by using weights GiYi) in (2). A justification of this algorithm can be 
found in Yin and Cai [23]. Note that the local Kaplan-Meier estimate G((3' Zi\Zi) can be 
obtained by simply modifying Wang and Wang's R function for their local Kaplan-Meier 
estimation. 

We compare the mean bias (MB), the median absolute error (MAE) and the root 
mean square errors (RMSE) of these procedures (Kocnkcr [9]). We fix h = 0.05 for all 
the simulations as suggested by Wang and Wang [22]. Other choices of the bandwidth 
were also tried. The results are similar and are omitted to save space. We investigate the 
performance at the median r = 0.5 for two sample sizes n = 100 and 200. For Examples 1 
and 2, we have also examined the performance at t = 0.7, and the results are similar. For 
each setup, the simulation is repeated 500 times. And we use 400 bootstrap replications 
for inference. 

Example 1. We take the first example from Wang and Wang [22] to generate failure 
time from the following i.i.d. error model 

T i =b + b 1 Zi + e i , i = l,...,n, 

where bo = 3, b\ = 5, Z ~ U(0, 1) and Ej = r\i — $ _1 (t) with {?7i}™ =1 being i.i.d. standard 
normal random variables. The censoring variable is either generated from U(0, 14) or 
£7(0,36) such that about 40% or 15% of the observations are censored at the median, 
when r = 0.5 is used for generating e. 

Example 2. This example is again taken from Wang and Wang [22]. The data are 
generated from 

Ti = b + hzi + (0.2 + a(zi - 0.5) 2 )e 4 , i = 1, . . . ,n, 

where bo = 2, b\ = 1, Zi ~ N(0, 1) and e, = rji — $ _1 (r) with {rji}" =1 being i.i.d. standard 
normal random variables. Here a takes the value 0, 0.5 and 2 to indicate no, median 
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and strong deviation from the global linearity assumption. The censoring variable Ci is 
generated from {7(0, 7) or [7(0,18) to give 40% and 15% censoring at the median for 
a = 2. 

Example 3. The model to generate Tj is the same as Example 2 with bo = 1 and a = 2. 
The censoring variable Cj is generated from a mixture of distributions. Specifically, if 
Zi < 1, Ci is generated from [7(0,4); otherwise, Ci is generated from 17(0,8). This scheme 
gives about 30% censoring at the median. 

Example 4- This model is similar to Example 2 with a = 2. However, the censoring time 
Ci is generated from the following model Ci = A + bzi + rji,i = 1, . . . , n, where b — 0, 0.5, 1 
indicates a different level of dependence of the censoring time on the covariates. The 
random variable rji follows the standard normal distribution, and A is either 1.35 or 2.6, 
such that when b = l, about 40% or 15% observations are censored. 

For Examples 1 and 2, the censoring time and the survival time are independent. In 
Example 1, all the conditional quantilcs are linear functions of the covariates. Example 2 
gives a model with only the rth quantile being a linear function when a ^ 0. Note that 
Wang and Wang [22] have shown that Portnoy's approach gives biased estimates for the 
coefficients when a — 2. For Examples 3 and 4, T and C are not independent, but they are 
conditionally independent given the covariates. Example 3 uses a mixture distribution to 
generate censoring time, while in Example 4, a linear dependence of C on the covariates 
is used. Two different censoring rates are examined for Examples 1, 2 and 4. 

It is not difficult to see that the initial estimate is also -^/n-consistent. However, 
we observe empirically that it is less efficient than the final estimate after iteration. For 
example, in Example 2 when n = 100, a = 2 and censoring rate is 40%, the RMSEs of 
are 0.438 and 0.697 when censoring rate is 40%, and 0.246 and 0.450 when about 
15% of the data are censored. A related comparison was made in Yin, Zeng and Li [24]. 
The results for the other methods are summarized in Tables 1 and 2 when r = 0.5 and 
n = 100. The results for t = 0.7 orn = 200 arc qualitatively similar and thus are omitted. 
We have the following observations. First, CQR outperforms YJW in general, especially 
when the unconditional independence is violated. Second, when the global linearity holds, 
Port and PH generally outperform CQR and Lcrq, although by a small margin. When 
the global linearity is mildly violated, PH and Port both perform competitively with 
CQR and Lcrq. This demonstrates the robustness of PH and Port. However, when this 
assumption is severely violated, CQR and Lcrq perform better in general, especially in 
terms of RMSE. However, how much improvement can be expected is likely dependent on 
a number of factors, such as the censoring mechanism and rates. Third, CQR and Lcrq 
have similar performance across all the simulations. The difference between these two 
approaches is usually negligible. Fourth, when the censoring is low (15%), CQR performs 
competitively compared to Lcrq, indicating its robustness with respect to the required 
sample size for estimating the local Kaplan-Meier curve. We conclude that when the 
global linearity is violated, and C is not unconditionally independent of T, the proposed 
method is preferred over YJW, Port and PH. 
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HjX. C/O 


a 




Bias 




MAE 




RMSE 




Po 


Pi 


Po 


Pi 


Po 


Pi 


1 40% 




YJW 


0.010 


-0.015 


0.228 


0.431 


0.341 


0.644 






CQR 


-0.007 


-0.092 


0.211 


0.392 


0.305 


0.583 






Lcqr 


-0.009 


-0.018 


0.196 


0.375 


0.299 


0.554 






Port 


-0.042 


-0.003 


0.198 


0.381 


0.299 


0.548 






PH 


0.019 


0.005 


0.203 


0.380 


0.299 


0.558 


15% 




YJW 


-0.016 


0.023 


0.191 


0.330 


0.293 


0.508 






CQR 


-0.012 


-0.012 


0.186 


0.317 


0.280 


0.481 






Lcqr 


-0.013 


0.005 


0.186 


0.311 


0.281 


0.485 






Port 


-0.059 


0.007 


0.179 


0.298 


0.279 


0.469 






PH 


0.001 


0.007 


0.186 


0.305 


0.277 


0.477 


2 40% 


2 


YJW 


0.007 


-0.009 


0.159 


0.326 


0.249 


0.537 






CQR 


-0.060 


-0.030 


0.139 


0.267 


0.211 


0.393 






Lcqr 


-0.053 


0.008 


0.144 


0.272 


0.215 


0.406 






Port 


-0.021 


-0.010 


0.164 


0.308 


0.224 


0.443 






PH 


0.058 


-0.119 


0.166 


0.304 


0.235 


0.460 




0.5 


YJW 


-0.001 


0.012 


0.058 


0.125 


0.089 


0.188 






CQR 


-0.044 


-0.016 


0.064 


0.106 


0.095 


0.154 






Lcqr 


-0.021 


0.009 


0.058 


0.106 


0.088 


0.164 






Port 


-0.020 


0.012 


0.058 


0.109 


0.089 


0.168 






PH 


0.013 


-0.014 


0.057 


0.106 


0.087 


0.169 







YJW 


0.003 


0.008 


0.022 


0.025 


0.033 


0.040 






CQR 


-0.022 


-0.012 


0.028 


0.024 


0.041 


0.037 






Lcqr 


-0.004 


-0.002 


0.021 


0.022 


0.031 


0.034 






Port 


-0.010 


0.001 


0.021 


0.021 


0.032 


0.033 






PH 


0.003 


0.001 


0.021 


0.021 


0.031 


0.033 


15% 


2 


YJW 


-0.006 


0.012 


0.146 


0.292 


0.212 


0.425 






CQR 


-0.024 


-0.004 


0.134 


0.267 


0.202 


0.393 






Lcqr 


-0.023 


0.013 


0.138 


0.271 


0.202 


0.396 






Port 


-0.051 


0.039 


0.141 


0.277 


0.214 


0.410 






PH 


0.021 


-0.038 


0.143 


0.288 


0.208 


0.406 




0.5 


YJW 


0.001 


0.000 


0.055 


0.100 


0.082 


0.150 






CQR 


-0.011 


-0.008 


0.055 


0.092 


0.082 


0.141 






Lcqr 


-0.005 


0.000 


0.052 


0.092 


0.081 


0.144 






Port 


-0.025 


0.014 


0.056 


0.091 


0.085 


0.144 






PH 


0.008 


-0.008 


0.054 


0.094 


0.081 


0.144 







YJW 


0.001 


0.003 


0.018 


0.019 


0.027 


0.029 






CQR 


-0.007 


-0.003 


0.020 


0.019 


0.028 


0.028 






Lcqr 


-0.001 


-0.000 


0.019 


0.018 


0.027 


0.027 






Port 


-0.011 


-0.000 


0.020 


0.018 


0.029 


0.027 






PH 


0.001 


0.000 


0.018 


0.018 


0.026 


0.027 
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Table 2. Simulation results for Examples 3 and 4 



Ex. c/o 


a 




Bias 




MAE 




RMSE 




o 

Po 


Q 

Pi 


Q 

Po 


Pi 


Q 

Po 


Pi 


3 30% 




YJW 


0.035 


0.544 


0.124 


0.482 


0.206 


0.810 






CQR 


0.005 


-0.023 


0.115 


0.223 


0.164 


0.325 






Lcrq 


0.011 


0.005 


0.109 


0.234 


0.164 


0.333 






Port 


0.020 


0.017 


0.124 


0.257 


0.190 


0.367 






PH 


0.088 


-0.058 


0.128 


0.261 


0.229 


0.388 


4 30% 


1 


YJW 


-0.115 


0.421 


0.121 


0.411 


0.159 


0.473 






CQR 


-0.077 


0.081 


0.095 


0.137 


0.137 


0.217 






Lcqr 


-0.080 


0.083 


0.095 


0.138 


0.139 


0.218 






Port 


-0.029 


0.024 


0.086 


0.140 


0.131 


0.226 






PH 


0.017 


-0.020 


0.085 


0.148 


0.130 


0.222 




0.5 


YJW 


-0.063 


0.227 


0.0944 


0.228 


0.137 


0.331 






CQR 


-0.055 


-0.011 


0.090 


0.155 


0.131 


0.217 






Lcqr 


-0.051 


-0.010 


0.090 


0.153 


0.129 


0.222 






Port 


-0.018 


-0.022 


0.093 


0.178 


0.136 


0.251 






PH 


0.035 


-0.089 


0.090 


0.189 


0.141 


0.267 







YJW 


0.000 


-0.020 


0.096 


0.207 


0.143 


0.311 






CQR 


-0.042 


-0.081 


0.088 


0.191 


0.127 


0.264 






Lcqr 


-0.040 


-0.056 


0.086 


0.199 


0.129 


0.279 






Port 


-0.020 


-0.037 


0.088 


0.207 


0.135 


0.306 






PH 


0.030 


-0.111 


0.092 


0.234 


0.135 


0.319 


15% 


1 


YJW 


-0.020 


0.226 


0.079 


0.226 


0.117 


0.323 






CQR 


-0.014 


0.023 


0.078 


0.148 


0.114 


0.215 






Lcqr 


-0.015 


0.021 


0.076 


0.146 


0.114 


0.213 






Port 


-0.038 


0.029 


0.084 


0.155 


0.125 


0.227 






PH 


0.007 


-0.007 


0.078 


0.152 


0.118 


0.223 




0.5 


YJW 


-0.017 


0.157 


0.081 


0.187 


0.123 


0.290 






CQR 


-0.004 


-0.015 


0.078 


0.141 


0.119 


0.218 






Lcqr 


-0.002 


-0.021 


0.077 


0.144 


0.119 


0.216 






Port 


-0.034 


0.008 


0.088 


0.145 


0.130 


0.225 






PH 


0.014 


-0.036 


0.081 


0.143 


0.124 


0.226 







YJW 


0.000 


-0.001 


0.083 


0.167 


0.123 


0.260 






CQR 


-0.000 


-0.052 


0.080 


0.156 


0.119 


0.235 






Lcqr 


0.000 


-0.058 


0.079 


0.160 


0.119 


0.237 






Port 


-0.031 


-0.021 


0.085 


0.157 


0.126 


0.237 






PH 


0.017 


-0.077 


0.079 


0.166 


0.123 


0.248 



We assess the performance of the bootstrap inference procedure by comparing it to the 
bootstrap percentile inference procedure developed in Wang and Wang [22] . For brevity, 
we only report the result for Example 2 when a = 2, and the censoring rate is 40%, and 
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Table 3. Performance of the inference procedure. For Example 2, only the result for a — 2 and 
40% censoring is presented 



Ex. n 


T 


CQR 








Lcrq 








ECP 




EML 




ECP 




EML 




A) 


Pi 


Po 


Pi 


Po 


Pi 


Po 


Pi 


2 100 


0.5 


0.948 


0.930 


0.800 


1.541 


0.938 


0.950 


0.806 


1.564 




0.7 


0.938 


0.944 


0.862 


1.678 


0.956 


0.954 


0.883 


1.740 


200 


0.5 


0.928 


0.934 


0.533 


1.031 


0.946 


0.938 


0.586 


1.147 




0.7 


0.930 


0.932 


0.567 


1.102 


0.934 


0.948 


0.596 


1.174 


3 100 


0.5 


0.910 


0.930 


0.612 


1.195 


0.942 


0.924 


0.596 


1.196 




0.7 


0.916 


0.928 


1.037 


2.086 


0.928 


0.936 


0.956 


1.966 


200 


0.5 


0.916 


0.916 


0.423 


0.839 


0.954 


0.950 


0.424 


0.872 




0.7 


0.934 


0.928 


0.772 


1.490 


0.946 


0.944 


0.715 


1.445 



for Example 3. We record the empirical coverage probability (ECP) and the empirical 
mean length (EML) of the resulting confidence intervals in Table 3. The nominal level 
used is 0.95. For these two examples, both CQR and Lcrq give coverage probabilities 
close to the nominal level with comparable average empirical lengths. 

Since CQR relies on the local Kaplan-Meier estimate, it is of great interest to see 
how it performs when Z is multi-dimensional. To this end, we use the same model in 
Example 1, but add independent standard uniform covariates z^ 2 z^ d > to z. Thus, 
the coefficients associated with these additional covariates are zero. We use bandwidth 
0.05, 0.1, 0.2, 0.3, 0.4, respectively, when d = 1, 2, . . . , 5. In Table 4, we see that with 
n = 200, CQR seems to give unbiased estimates when d = 1, 2 and 3 with low censoring 
15%. However, it can only be applied up to d = 2 if censoring rate is as high as 40%. 

5. Data analysis 

As an example, we apply the proposed method to the Colorado Plateau uranium miners 
cohort data (Lubin et al. [14], Langholz and Goldstein [12]). The major interest of this 
study is to assess the effect of smoking on the rate of median lung cancer. This data set 
consists of 3347 Caucasian male miners who worked underground for at least one month 
in the uranium mines of the Colorado Plateau area. In total, there are 258 miners who 
died of lung cancer. Apart from the failure time, information of the age, the cumulative 
radon exposure and cumulative smoking in number of packs is available. In our study, 
we randomly choose 258 miners who are censored and all the miners who experience the 
lung cancer. We use this scheme to yield a median censoring scenario, suggested by the 
simulation studies. This data analysis means to illustrate the difference between different 
approaches. The scatter plots of the log survival time are presented at the top row of 
Figure 1. Let Z\ be the logarithm of the cumulative radon exposure in 100 working level 
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Table 4. Multi-dimensional covariates when n = 200. The standard errors (SE) are reported 
in parentheses. Note that the SEs of MAE and RMSE are computed via bootstrapping 1000 
replications 





Bias 




1\ 1" A TJ 1 








d c% 


Po 


Pi 


Po 


Pi 


Po 


Pi 


i *±yj /o 


— o m n 

vJ.UJ-U 


U.UOI 


n i ^9 


n 977 


0.220 






(U.ZZU ) 


i n a i o\ 
(U.412 ) 


(u.uuo ) 


{ c\ nnn\ 
(U.UUa ) 


(U.UUo J 


(U.UUa J 


15% 


0.001 


-0.010 


0.127 


0.227 


0.191 


0.342 




(0.191) 


(0.342) 


(0.006) 


(0.010) 


(0.005) 


(0.008) 


2 40% 


-0.070 


-0.260 


0.199 


0.338 


0.294 


0.493 




(0.286) 


(0.419) 


(0.009) 


(0.012) 


(0.007) 


(0.010) 


15% 


-0.027 


-0.063 


0.177 


0.245 


0.258 


0.350 




(0.256) 


(0.344) 


(0.006) 


(0.007) 


(0.006) 


(0.008) 


3 40% 


-0.100 


-0.339 


0.247 


0.378 


0.369 


0.535 




(0.355) 


(0.414) 


(0.009) 


(0.010) 


(0.008) 


(0.011) 


15% 


-0.032 


-0.096 


0.211 


0.236 


0.315 


0.353 




(0.314) 


(0.341) 


(0.008) 


(0.007) 


(0.007) 


(0.007) 


4 40% 


-0.087 


-0.414 


0.285 


0.452 


0.438 


0.602 




(0.430) 


(0.438) 


(0.011) 


(0.016) 


(0.011) 


(0.013) 


15% 


-0.029 


-0.130 


0.238 


0.229 


0.355 


0.359 




(0.354) 


(0.334) 


(0.011) 


(0.009) 


(0.007) 


(0.010) 


5 40% 


-0.089 


-0.441 


0.364 


0.465 


0.493 


0.616 




(0.485) 


(0.430) 


(0.013) 


(0.019) 


(0.011) 


(0.014) 


15% 


-0.048 


-0.125 


0.254 


0.245 


0.387 


0.367 




(0.384) 


(0.345) 


(0.009) 


(0.009) 


(0.008) 


(0.009) 


months, Zi 


be the cumulative smoking 


in 1000 packs and Z 3 


be the age at entry to 


the study. To explore the dependence of the log 


survival time 


against these covariates, 



we fit three separate marginal models using polynomial B-splines to approximate the 
effects of radon, age and smoking, respectively. In Figure 1, we plot the estimated log 
survival time against the three covariates at quantiles r = 0.01,0.05,0.1,0.3,0.5. Strong 
non-linearity is present, especially for lower quantiles. When r = 0.5, the log survival time 
is approximately linear. These facts suggest that the global linearity assumption may not 
hold. To further examine whether unconditional independence of the survival time and 
censoring time is appropriate, we fit the Cox model to the censoring time with respect to 
the covariates. The two covariates radon and age are both found significant from zero with 
p-values less than 10 -3 . This indicates that the unconditional independent assumption 
needed for YJW may be inappropriate for this data. Graphically, the dependence of the 
censoring time on the covariates can be seen from Figure 1, where Kaplan-Meier estimates 
of the survival functions, dichotomized by the median of these covariates, are plotted. 
From the figure, an observation is more likely to be censored at an earlier time if radon 
is high, age is young or the subject smoked less. Formal log-rank tests by dichotomizing 




Figure 1. Colorado miners cohort data. Top row: The scatter plots of the log survival time versus the covariates. The marginally 
fitted log survival times against each covariate at quantiles O.Of , 0.05, O.f , 0.3, 0.5 (the solid lines from the bottom to the top) are 
also plotted. Bottom row: The fitted Kaplan-Meier survival curves for the censoring time when covariates are dichotomized. 
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Table 5. The fitted coefficients of the censored quantile regression at the median and the 95% 
confidence intervals for the Colorado Plateau uranium miners cohort data 







CQR 




YJW 




Lcrq 




Radon ( 


xKT 2 ) 


-13.05 ( 


-28.85,-9.84) 


-2.95 { 


-12.4,3.23) 


-25.60 ( 


-34.45,-13.55) 


Smokin; 


; (xlCT 3 ) 


-0.38 ( 


-3.73,3.30) 


0.04 ( 


-5.19,3.58) 


0.65 ( 


-4.91,2.86) 


Age 




-1.60 ( 


-2.27,-1.28) 


-2.01 ( 


-2.61,-1.44) 


-2.12 ( 


-2.76,-1.39) 



the covariates also indicate that radon (p-value < 10 -3 ) and smoking (p- value 0.03) are 
highly correlated with the censoring time, while age (p- value 0.08) is not significant. Note 
that these log-rank tests only investigate these covariates marginally. 

Since we have three continuous covariates, we use the three-dimensional kernel after 
standardizing the covariates, which is the product of three bi-quadratic kernels for radon, 
smoking and age. We investigate the median log survival time on the three covariates. 
For Lcrq and CQR, we use the same bandwidth for the three univariate kernels and apply 
10-fold cross validation to choose the optimal bandwidth. The results are summarized 
in Table 5. It is seen that Ying et al. estimate age as the only significant variable, 
while CQR and Lcrq both estimate age and radon as significant variables. The result 
of Ying et al.'s approach in this example is problematic due to the dependence of the 
censoring time on the covariates. The 95% confidence intervals are obtained by using the 
bootstrap percentile approach (Wang and Wang [22]) using 1000 bootstrap repetitions. 
The fact that we only use a random sample for the censored data suggests that the 
results, especially the numerical ones, should be interpreted with certain caution. 

6. Conclusion 

We have proposed a novel extension of Ying, Jung and Wei's median regression to quan- 
tile regression. Our model is more flexible in that only conditional independence of the 
survival time and censoring time are assumed. Moreover, we have proposed a new and 
fast fitting algorithm, applicable to Ying et al.'s median regression model, making use 
of the efficient quantile regression code developed by Koenker. Therefore, resampling 
based inference procedure can be efficiently implemented. We have compared our esti- 
mator to the approaches developed in Portnoy [17], Peng and Huang [16] and Wang and 
Wang [22] . The simulation results show that the new method is useful and may have cer- 
tain advantages over the other methods, especially when the global linearity is violated 
or the unconditional independence of C and T does not hold. 

Idcnt inability remains a serious issue in censored quantile regression, particularly so 
when t is close to 1 or (Peng and Huang [16], Wang and Wang [22]). In practice, 
we recommend to choose r in the inference range of interest. Another limitation of 
the current method is the requirement of estimating Gq(-\Z), which inevitably suffers 
from the curse of dimensionality if Z is multi-dimensional. In this case, it may be more 
attractive to handle Gq(-\Z) by using, for example, the Cox model or the single-index 
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model. This line of research merits further investigation. Furthermore, the Kaplan-Meier 
estimates, even for a global one, may be unstable at the right tails. The technique in 
Zhou [27] may be used to improve the stability of these estimates. 

Appendix 

For convenience, we write ||/3|| as the Euclidean norm of a finite dimensional vector /3 and 
HG^Hoo as the supreme of the absolute value of a function G(-). First, we cite Theorem 
2.1 in Gonzalez and Cadarso [4]. 

Lemma A.l. Assume that conditions C4-C6 hold, then 

\\G- Golloo = supsup|G(t|z) - G (t\z)\ = O v ((\ogn) 1 ' 2 n - 1 /' 1+v ' 2 + n~ 2v ). 

t z 

A.l. Proof of Theorem 1 

Let M n (j3) = Y^i=i( T ~ F^{Z[p\Zi))Zi. It follows from the similar arguments as in Ying 
et al. [25] and Lemma A.l that 

Bupn-^MnGS)- M n {0)\=o{\) a.s. (A.l) 

From assumption C7, A n (f3) = ±^§^ = -££?=i ZiZ(f (Z^\Zi) is negative defi- 
nite with probability one for (3 in a small neighborhood of (3q. In addition, Af„(/?o) = 0. 
Therefore, n _1 M n (/3) is bounded away from zero. This argument, together with (A.l), 
yields that (3 n — > /3q in probability as n — > oo. 

A. 2. Proof of Theorem 2 

To prove Theorem 2, we exploit Theorem 2 in Chen et al. [2] by verifying their conditions 
(2.1)-(2.4), (2.5') and (2.6'). For convenience, write M n (/3,G) = ± VJ™ =1 mi (/3, G), where 

rrii{fi,G) = Zi{ I ^0'pyz\ ~ — r )j' anc ^ mnc tion class Q that involves the true Go 
as the set of G, such that G has a density function g, mi ze z G(7~\z) > t]q and g(-\z) 
is bounded away from infinity uniformly in t and z £ Z. Then M(/3,G) = Em,i(/3,G) = 

EZi{ 0—^ z'ffjz"/^^ — (•"■ — T )j'' wnere the expectation operator is taken with 
respect to the marginal distribution function of Zi and thus M(/3q,Gq) = 0. 

Lemma A. 2. For any positive value £„ =o(l), we have that 

sup \\M n (P,G)-M(P,G)-M n (l3 ,G )\\ =o p (nr 1 / 2 ). 
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Proof. Let 771 = sup zeZ \\z\\ 2 V 1 and ?y 2 = sup GegzeZt < T (f (t\z) + g(t\z)) < 00 from 
assumption C4. For any ((3,G) e B x Q and (J3*,G*) G B x Q, we have that ||m(/3,G) - 
m(,3*,G*)|| 2 < 2(C/i + ?7 2 + £/ 3 ) where 

C/i = ||ZG(Z'/3|Z)- 1 (/(y > Z'P) - I(Y > Z'P*)) || 2 
<V3\I(Y>Z'j3)-I(Y>Z'j3*)\, 

u 2 = \\zi(y > z' p*){G{z' p\zy l - g\z' ^zy^f < m \\G- GIL 
u 3 = \\zi(y > z'p*){G*{z'p\zy l - G*(z'p*\zy 1 )\\ 2 < V5 \\p- pf, 



(A.2) 



where 773, 774, 775 are some positive constants, only depending on 77^ (k = 0, 1, 2). It follows 
from (A.2) that £ , (sup|| / 3_ (g *|| < ^ i U\) < 771772773^ and that 

sup p/(/3,G)-A/(/r,G*)|| 2 <7 ?6 £„ (A.3) 

l|j8-/9o||<en,||G-G ||»<C„ 

for some constant % > as n is sufficiently large. 

Therefore, condition (3.2) of Chen et al. [2] holds with r = 2 and Sj = 1/2. Similarly 
to the arguments used in (A.2), condition (3.1) in Chen et al. [2] can be also verified. 
Now we verify their condition (3.3). Let N(r),G, \\ ■ ||oo) be the covering numbers (van 
der Vaart and Wcllncr [21], page 83) for the function class Q under the metrics || ■ ||oo. 
An application of Theorem 2.7.1 in van der Vaart and Wellner [21] from assumptions C4 
and C2 gives that the logarithm of the covering number of Q is bounded by Kr\~ x l 2 for 
77 < 1, where K is some constant, not depending on n. When 77 > 1, it follows from the 
definition of covering numbers that log N(j], Q, \\ • ||oo) = 0, which yields that 

{logA^^IHloo)} 1 /^^ / i^V^d^oc. 

Jo 

It then follows easily from Theorem 3 of Chen et al. [2] that Lemma A.2 holds. □ 

To apply Theorem 2 in Chen et al. [2], we define Fi(/?o,Go) as the first derivative 
function of M(/3, Go) with respect to /3 evaluated at f3 = /?o- For all /? S B, we define the 
functional derivative of M ((3, G) at Go in the direction [G — Go] as 

r 2 (/3, G )[G - Go] = lim - [M(/3, G + 7 ? (G - G )) - M(p, G )]. 
v^o 77 

Lemma A.3. Assume that the conditions in Theorem 2 hold, then 

n 1 ' 2 (M n (Aj , Go) + T 2 (/3 , Go) [G - G ] ) 4 N(0, V) , 
where V = cov(Vi) with Vi = mi(0 o ,G o ) - (1 - T)Zif z (Zi)ij;(Yi,6i, Z-0o,Zi), and 
Y * M -g (s\z)ds 0—Si)KYi<t) 



ip(Yi,5i,t,z) 



{G (s\z)} 2 {l-Fo(s\z)} G (Yi\z){l-F Q (Yi\z)}- 
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Proof. By the definition of r 2 , a direct calculation gives that 

r 2 (A),G )[G - G ] = -(1 - T )EZ{G(Z%\Z) - G (Z'p \Z)}/G (Z'p \Z). (A.4) 

From Theorem 2.3 of Gonzalez-Manteiga and Cadarso-Suarez [4] and the proof of 
Theorem 2 in Wang and Wang [22], using assumptions C3-C7, we have that 

G{t\z)-G (t\z) = ^^K(^^\G {t\z)^{Y i ,5 i ,t,z) 

(A.5) 



o P i[^y /4 + /, 



ih 



Plugging (A.5) into (A.4), using standard change of variables and Taylor expansion ar- 
gumcnts, we obtain that r 2 (/3 , G )[G-G ] = -(l-r)£ £™ =1 Z l fz(Z l )ij(Y l ,S ll Z' l l3 0l Z t ) -4 

o p {n- 1 ' 2 ). Therefore, we have n 1 / 2 ( M n (/3 , Go) + T 2 (fio , G ) [G - Go] ) - n~ 1/2 Eti K + 
o p (l). An application of the central limit theorem gives that 

n 1/2 (M„(/3 , Go) + r 2 (/3 , G )[G - G ]) 4 N(Q, V). 
This proves the lemma. □ 

Proof of Theorem 2. We verify the conditions in Theorem 2 in Chen et al. [2]. Their 
condition (2.1) can be easily verified by the subgradient condition of quantile regression 
(Kocnker [8]). Their conditions (2.4), (2.5') and (2.6) follow directly from Lemma A. 1, A. 2 
and A. 3, respectively. From the definition of Ti, we obtain that 



r 1 = r 1 ( A , G „) = 3M C- G ») 



dp 



= -EZZ'f (Z , f3 \Z), 

P=Po 



which is negative definite by assumption C7. Thus condition (2.2) in Chen et al. [2] 
holds. By the routine Taylor expansion, we can verify condition (2.3) in Chen et al. [2]. 

Therefore, we obtain that n 1 / 2 (/3„ - fa) 4 iV(0,ri(/?o,Go)" 1 V r ri(^o,Go)" 1 ). The proof 
is complete. □ 
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